1. Introduction


In  the  ever-increasing  realm  of  “high-tech”  Soldier  systems,  one  factor  remains  fairly  constant:  the 
human  factor.  The  use  of  multiple  high-tech  and  increasingly  complex  systems  is  intended  to  add 
capabilities  to  Soldiers  and  in  many  cases  to  reduce  stress  and  workload.  However,  these  systems 
may  well  add  increased  levels  of  stress  and  workload  onto  Soldiers  who  are  already  at  heightened 
levels  of  each  because  of  the  particular  environments  in  which  these  systems  are  employed.  As  just 
one  example,  Dixon  and  Wickens  (2006)  found  that  diagnostic  automation  to  assist  unmanned  aerial 
vehicle  operators,  when  operated  at  less  than  80%  accuracy,  resulted  in  more  workload  compared  to 
no automation at all. To gauge the levels of stress and workload imposed on
these Soldiers, researchers and materiel developers have used a small number of tools at their
disposal. The two primary tools are self-report surveys and salivary amylase tests. Surveys are quick
and cheap but subjective, while salivary amylase tests are objective but time consuming, intrusive,
and expensive. As requirements increase to incorporate larger numbers of high-tech and more
complex systems with Soldier-in-the-loop (SIL) systems, researchers will need a method to gather stress
data in an accurate, timely, and less intrusive manner.

This  report  discusses  the  use  of  a  third  method  to  measure  Soldier  stress:  galvanic  skin  response 
(GSR).  The  first  step  of  this  evolutionary  process  compared  the  survey  method  with  the  GSR 
method  in  an  attempt  to  determine  if  GSR  data  are  similar  to  survey  stress  data  in  terms  of  statistics 
and  trends.  The  theory  is  that  if  these  data  are  similar  in  the  same  experimental  circumstances,  this 
would  provide  impetus  to  pursue  further  research  among  all  three  methods:  survey,  GSR,  and 
salivary  amylase. 

The  ultimate  goal  of  this  research  (this  phase  and  ensuing  research)  is  to  determine  if  the  GSR 
method is a suitable “middle ground” between the survey method, which is subjective and
somewhat intrusive, and the salivary amylase method, which is very time consuming, costly, and
intrusive. GSR has the potential to provide researchers with a tool for objectively measuring Soldier
stress  that  is  quick,  effective,  and  unobtrusive  during  research,  training,  and  operational  conditions. 
Discussion  includes  results  of  the  survey-GSR  comparison  and  recommendations  for  ensuing 
research  to  examine  the  differences  among  all  three  methods. 


2.  Stress  Measurement  Methods 


2.1 Survey Method

The primary tool used in the field has been the validated stress survey or questionnaire. Surveys
provide a standardized method of soliciting how stressed a Soldier feels in a given circumstance or
situation and are a fairly easy and “low-tech” means to gather data. The primary disadvantage
of  surveys  is  that  they  are  a  subjective  assessment  of  stress.  This  means  the  data  are  subject  to 
the  perceptual  and  cognitive  biases  inherent  in  all  of  us.  The  method  for  controlling  (but  not 
eliminating)  such  bias  is  to  take  a  large  sample  of  repeated  measures  of  stress  during  a  given 
experiment  or  event. 

The  stress  survey  used  in  this  study  was  a  simple  pen-and-paper  questionnaire,  using  a  Likert-type 
scale  measuring  both  physical  and  mental  stress  (appendix  A).  This  survey  was  developed  and  has 
been  used  extensively  by  the  U.S.  Army  Research  Laboratory’s  (ARL’s)  Human  Research  and 
Engineering  Directorate  in  field  and  laboratory  environments  (Keryl  &  Bialek,  1958;  Perala,  2005; 
Sterling & Jacobson, 2006; Perala, Sterling, Scheiner, & Butler, 2007). Although quick and
relatively easy to administer, the survey requires that after an experimental trial, participants must
recall  events  and  aspects  of  the  trial  and  then  rate  (on  a  scale  of  1  to  10)  their  perceived  levels  of 
stress  in  relation  to  those  events  and  aspects  of  the  trial.  This  is  conducted  at  the  end  of  every  trial, 
and  the  experimenter  collates  the  data  collected  from  all  participants  for  use  in  later  statistical 
analyses. 

2.2  Continuous  Monitoring  Method 

One  proposed  method  to  collect  continuous  and  objective  data  in  a  non-intrusive  manner  is  by  the 
use  of  GSR.  Also  known  as  electrodermal  response,  psychogalvanic  reflex,  or  skin  conductance 
response,  GSR  is  a  method  of  capturing  the  autonomic  nerve  response  as  a  parameter  of  the  sweat 
gland  function  (i.e.,  measuring  the  electrical  resistance  of  the  skin).  As  stress  levels  increase, 
changes  in  the  electrical  resistance  of  the  skin  are  detected  by  GSR  sensors.  This  method  of  nerve 
response  detection  is  very  similar  to  that  used  in  modern  polygraph  tests. 

GSR  has  long  been  considered  a  measure  of  physiological  and  mental  stress  (Fenz  &  Epstein, 
1967).  Although  there  are  no  absolute  levels  of  GSR  indicative  of  high  workload  or  stress,  GSR  is 
a  good  relative  indicator  of  stress.  That  is,  higher  GSR  levels  recorded  during  certain  tasks  suggest 
higher  levels  of  stress.  One  caveat  to  GSR  is  that  although  there  is  a  relationship  between 
sympathetic  activity  and  emotional  arousal,  determining  the  specific  emotion  being  elicited  is 
difficult.  For  example,  fear,  anger,  startle  response,  orienting  response  and  sexual  feelings  are  all 
among  the  emotions  that  may  produce  similar  GSR  responses.  However,  controlling  for  these 
extraneous  emotions  may  assist  in  parsing  output  into  meaningful  results.  A  more  objective  way 
to  determine  whether  GSR  is  measuring  stress  (versus  some  other  emotion)  is  to  compare  GSR 
data  with  other,  known  stress  data.  For  example,  comparing  GSR  data  with  survey  stress  data  and 
salivary  amylase  stress  data  may  provide  sufficient  evidence  that  GSR  data  captured  during  the 
same experimental conditions are actually measuring stress.

2.2.1 Bio-instrumentation Armband

GSR data for this report were collected with a small, lightweight,
unobtrusive body monitor, called the SenseWear1 Pro2 armband by BodyMedia, Inc. (Comparable
products  are  available  which  measure  similar  autonomic  functions.)  The  armband  is  worn  on  the 
back  of  the  upper  arm,  which  enables  continuous  physiological  data  collection  outside  a  laboratory 
environment  (see  figure  1).  Using  metallic  sensors  close  to  the  skin  (see  figure  2),  the  armband 
(as  it  is  referred  to  throughout  the  text)  collects  biorhythmic  data  in  real  time,  with  a  configurable 
sample  rate,  and  gathers  raw  physiological  data  such  as  movement,  heat  flow,  skin  temperature, 
ambient  temperature,  and  galvanic  skin  response.  The  armband  may  be  worn  for  as  many  as  14 
days  continuously  with  the  same  internal  battery  and  can  store  as  many  as  14  days  (depending  on 
the sample rate) of continuous physiological data. A data time stamp feature allowed the
researcher to mark specific events in the data to facilitate later data analysis. The device is designed to
provide  auditory  and  tactile  feedback  during  certain  events;  however,  this  feature  was  altered 
(through  firmware  modification)  for  this  research,  so  this  feedback  did  not  interfere  with  the 
experimentation.  Armband  specifics  are  shown  in  appendix  B. 
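
As a rough illustration of how time-stamped event markers can facilitate later analysis, the following Python sketch splits a GSR time series at the marker times and averages each segment. The data layout, the function name `mean_gsr_by_segment`, and the sample values are assumptions made for illustration; they do not describe the armband's actual export format or software.

```python
from bisect import bisect_left
from statistics import mean


def mean_gsr_by_segment(samples, markers):
    """Average GSR within each interval between consecutive event markers.

    samples: list of (time_s, gsr_microsiemens) tuples sorted by time
             (hypothetical layout, not the armband's export format)
    markers: sorted list of marker times in seconds (hypothetical)
    Returns a list of (start, end, mean_gsr), one per non-empty interval.
    """
    times = [t for t, _ in samples]
    results = []
    for start, end in zip(markers, markers[1:]):
        lo = bisect_left(times, start)  # first sample at or after start
        hi = bisect_left(times, end)    # first sample at or after end
        values = [g for _, g in samples[lo:hi]]
        if values:
            results.append((start, end, mean(values)))
    return results


# Hypothetical samples, for illustration only (not data from this study).
samples = [(0, 0.40), (1, 0.42), (2, 0.50), (3, 0.54), (4, 0.46), (5, 0.44)]
markers = [0, 2, 4, 6]
segments = mean_gsr_by_segment(samples, markers)
```

Each returned segment covers the half-open interval from one marker up to, but not including, the next, so a sample is never counted in two segments.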


Figure  1.  BodyMedia  SenseWear  Pro2  armband 
worn  by  test  participant. 


1  SenseWear  and  BodyMedia  are  registered  trademarks  of  BodyMedia,  Inc. 

Figure 2. Metallic sensors on the underside of the armband.


2.3  Salivary  Amylase 

Another  method  to  gather  stress  data,  and  one  that  is  not  subject  to  human  bias,  is  the  salivary 
amylase  test.  This  is  a  very  objective  test  that  measures  the  amount  of  amylase  found  in  human 
saliva.  Amylase  is  an  enzyme  used  to  hydrolyze  or  break  down  starch  molecules  in  the  body. 
The  levels  of  amylase  in  the  body  have  been  used  as  an  effective  measure  of  stress,  including 
social  stress  such  as  performance  in  front  of  an  audience  (Nater,  La  Marca,  Florin,  Moses, 
Langhans,  Koller,  &  Ehlert,  2006;  Rohleder,  Wolf,  Maldonado,  &  Kirschbaum,  2006;  Gordis, 
Granger,  Susman,  &  Trickett,  2006;  Nater,  Rohleder,  Gaab,  Berger,  Jud,  Kirschbaum,  &  Ehlert, 
2005; Rohleder, Nater, Wolf, Ehlert, & Kirschbaum, 2004), testing (Yamaguchi, Kanemori,
Kanemaru, Takai, Mizuno, & Yoshida, 2004), competition (Kivlighan & Granger, 2006), and
physical stress (Wetherell, Crown, Lightman, Miles, Kaye, & Vedhara, 2006; Chatterton,
Vogelsong, Lu, Ellman, & Hudgens, 1996). Although objective, accurate, and repeatable, this method
is  time  consuming,  intrusive  (experimentally  as  well  as  human  intrusive),  and  costly  and  requires 
specialized  laboratory  equipment  to  analyze  the  saliva  samples. 


3.  Experiments  Using  Both  Survey  and  GSR  Methods 


The  first  step  in  this  evolutionary  process  of  stress  measurement  comparison  was  to  collect  actual 
GSR  data  in  the  field  via  the  armbands.  This  was  accomplished  to  good  effect  and  is  discussed  in 
section 3.1. The second step was to collect GSR data along with survey data during the same
experimental  conditions  (sections  3.2  and  3.3).  The  third  step  was  to  collect  data  using  all  three 
methods  (survey,  GSR,  and  salivary  amylase). 

3.1 Head-Tracked Sensor Suite Evaluation


3.1.1  Introduction 

The  head-tracked  sensor  suite  (HTSS)  is  a  complement  of  optical,  tracking,  and  display  systems 
designed  to  provide  vehicle  commanders  with  the  ability  to  visually  scan  a  360-  by  90-degree 
hemisphere  surrounding  their  vehicle  in  a  closed  hatch  environment.  The  system  also  enables  the 
vehicle  commander  to  see  beyond  the  immediate  area  for  target  and  terrain  detection,  recognition, 
and  identification  in  daytime  and  nighttime  conditions,  through  the  use  of  electronic  imagery 
created  by  the  fusion  of  forward-looking  infrared  and  image  intensification  technology  (see 
appendix  C). 

The objective of this study was to acquire preliminary user performance data and subjective user
evaluations and to gain insights from the early prototype HTSS system for use in the development
of the HTSS, version 2. Specifically, data were collected to determine the practicability of using
the  HTSS  as  a  means  to  increase  situation  awareness  (SA),  reduce  workload  and  stress,  and 
enhance  the  vehicle  commander’s  ability  to  detect,  recognize,  and  identify  (DRI)  targets  and 
terrain  while  moving  in  daytime  and  nighttime  conditions. 

3.1.2  Method 

The  original  intent  was  to  evaluate  the  HTSS  with  an  M1A2  Abrams  tank  as  the  test  vehicle 
platform;  however,  maintenance  problems  with  the  tank  procured  for  this  purpose  precluded  that 
intent. The alternate platform for this evaluation, an M1043 truck, was used in lieu of the M1A2
tank. The M1043 truck is an M998 high-mobility multipurpose wheeled vehicle (HMMWV) in
the  armament  carrier  (without  winch)  configuration.  For  this  reason,  certain  aspects  of  the  HTSS 
were  unable  to  be  evaluated  during  the  study.  Efforts  were  instead  focused  on  the  use  of  the 
system  in  target  DRI.  Participants,  who  were  experienced  armor  crewmen,  moved  through  a 
military  operations  on  urbanized  terrain  (MOUT)  and  movement  route  to  conduct  various 
reconnaissance  and  surveillance  tasks. 

Target  detection,  recognition,  and  identification  were  defined  as  follows.  Detection  was  defined 
as the point at which participants perceived an object on the screen that stood out from the
environment. Detection was usually denoted by the participant stating something to the effect, “I see
something at the tree line.” Recognition was defined as the point at which participants were able
to determine what the object was in general terms. For example, recognition occurred when
participants were able to determine that an object was a wheeled vehicle as opposed to a shed.
Identification was defined as the point at which participants were able to correctly
identify an object. That is, identification occurred when participants were able to determine that
the object was, for instance, a “deuce and a half” instead of a 5-ton truck.

3.1.3 Results


Figure  3  shows  the  comparison  of  GSR  values  for  MOUT  and  movement  environments  across 
each condition. Results indicate that GSR levels overall were lower during the movement trials than
during the MOUT trials. This may suggest that the shorter route, more confined space, and
unpredictable nature of the targets in the MOUT environment were more stressful to the participants
(using both the HTSS and night vision goggles [NVGs]) than the longer, less eventful scenario
experienced in the movement environment. This pattern is plausible and suggests that the GSR
values are representative of the events being measured. In both MOUT and movement
environments, GSR levels were highest when the HTSS was used, except for the night-HTSS condition.
This could be caused by unfamiliarity with the HTSS system in an otherwise very familiar task environment
(target DRI, sector scanning, etc.). A larger sample size and more trials may yield lower GSR
levels while increasing statistical power.

GSR levels in all conditions were lower than the baseline GSR levels. This may suggest that
baseline levels were derived when anticipation levels were highest (before the event started), and levels
dropped  after  participants  settled  into  the  job  of  performing  specific  tasks  during  each  trial. 
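
Because GSR is described here as a relative rather than an absolute indicator, one simple way to express a condition's GSR level is as a percent change from the baseline level. The sketch below assumes hypothetical condition names and values; it is not the analysis performed in this study.

```python
def percent_change_from_baseline(baseline, condition_means):
    """Express each condition's mean GSR as a percent change from baseline.

    baseline: baseline mean GSR in microsiemens (must be positive)
    condition_means: dict mapping condition name to mean GSR in microsiemens
    """
    if baseline <= 0:
        raise ValueError("baseline GSR must be positive")
    return {
        name: 100.0 * (value - baseline) / baseline
        for name, value in condition_means.items()
    }


# Hypothetical values, for illustration only (not data from this study).
changes = percent_change_from_baseline(
    0.50, {"day-HTSS": 0.45, "night-NVG": 0.40}
)
```

A negative value indicates a condition mean below baseline, which is the direction reported for all conditions in this evaluation.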


3.2  Aided  Target  Recognition  Experiment 
3.2.1  Introduction 

Future  scouts  will  have  many  simultaneous  tasks  with  which  to  contend.  They  will  be  required  to 
maintain  overall  SA  using  a  common  operational  picture;  receive  instructions  from  and  provide 
information to higher headquarters; plan and adjust routes for manned and unmanned vehicles;
monitor  sensor  locations;  receive  information  from  multiple  sensors,  synthesize  that  information, 
and  provide  actionable  data  to  those  who  need  it;  and  maintain  local  SA. 

Because the scout must perform several ongoing tasks, sufficient time will not be available to
simply focus on continuing sensor imagery. Furthermore, the battle space of the scout may be
complex, with many objects that could be taken for targets. Thus, effective aided target recognition
(AiTR)  technology  is  critical  to  reducing  scout  workload  and  enabling  scouts  to  perform  their  jobs 
more  effectively. 

Several  studies  have  demonstrated  that  AiTR  improves  target  identification.  McDowell  (1992) 
showed that performance with AiTR was better than unaided performance when AiTR was 40%
and  80%  reliable.  Similarly,  Entin,  Entin,  and  MacMillan  (1994)  demonstrated  that  AiTR  at  80% 
accuracy  increased  hits  in  target  recognition  over  unaided  target  recognition  without  increasing 
false  alarm  rates.  Kibbe  and  Weisgerber  (1991)  showed  that  AiTR  of  70%  and  90%  accuracy 
improved  target  recognition  over  unaided  performance,  but  AiTR  of  50%  accuracy  did  not. 

The  AiTR  technology  considered  in  this  study  was  not  simply  the  sensor  and  the  algorithms  used 
but  the  entire  Soldier-system  interface.  This  included  controls  such  as  a  mouse,  joystick,  and 
buttons.  This  also  included  displays  that  provided  the  Soldier  with  software  menus,  streaming 
imagery,  digital  maps,  representations  of  targets  on  the  terrain,  and  other  features. 

3.2.2  Method 

Participants were seven experienced scouts (rank of Sergeant E5 or Major O4). The participants
were  recruited  and  trained  in  the  use  of  the  interface  by  subject  matter  experts  working  with  the 
Night  Vision  and  Electronic  Sensors  Directorate  (NVESD)  of  ARL  on  this  project.  Since  the 
interface  involved  only  a  few  controls  and  functions,  roughly  1  hour  of  training  before  the 
experiment  was  sufficient  for  test  participants  to  be  able  to  operate  the  system. 

The  interface  consisted  of  two  computer  screens,  a  joystick  control  unit,  a  mouse,  and  a  keyboard, 
in  the  rear  of  a  HMMWV.  The  computer  screen  to  the  right  of  the  scout  provided  a  digital  map  of 
the  battlefield  and  was  referred  to  as  the  situation  awareness  screen  or  “SA  screen”  (see  appendix  D). 
The  AiTR  provided  Soldiers  with  the  ability  to  populate  the  SA  screen  with  “lased2”  targets.  The 
computer  screen  directly  in  front  of  the  scout,  referred  to  as  the  crew  station  screen,  provided  all 
sensor  feed  imagery  and  was  split  into  different  sections;  the  top  half  could  show  a  live  view  of  a 
specific  part  of  the  terrain  chosen  by  the  scout  when  in  stare  mode  or  a  selected  static  view  from  the 
gimbal  scan  mode,  which  was  updated  every  6  seconds.  Symbols  (color-coded  brackets)  for  targets 
detected  in  the  entire  area  selected  for  surveillance  were  displayed  in  three  locations:  a)  within  the 
image  chips  described  next,  b)  in  the  top  half  of  the  screen  where  live  and  static  imagery  was 
displayed,  and  c)  in  the  panoramic  view  that  was  displayed  at  the  bottom  of  the  screen. 

2 Lase means “to emit coherent light at.”

When AiTR was activated, as many as ten small pictures of potential targets (called chips) were
displayed  from  left  to  right  in  reference  to  their  locations  in  the  top  and  bottom  screens  just 
described.  Algorithms  assigned  a  confidence  to  target  reports  coming  from  AiTR  boxes.  The 
confidence  comes  from  how  target-like  the  detection  is,  based  on  measured  features.  The  user 
could  manually  set  a  threshold  of  confidence  for  target  detection.  If  the  user  sets  a  high  threshold, 
few  detections  will  be  made  and  the  likelihood  of  the  detections  being  actual  targets  will  be  high. 
Conversely,  if  the  user  sets  a  low  threshold,  more  detections  will  be  made,  but  the  chances  of  a 
detection being an actual target will be lower. When more than ten targets that meet the set
threshold have been detected, the first detections drop off the crew station screen. Within the AiTR
mode,  stationary  target  indication  (STI)  or  moving  target  indication  (MTI)  could  be  selected.  The 
STI  mode  elicited  a  higher  rate  of  false  positives  (e.g.,  hot  spots  caused  by  roofs  on  buildings). 

The  MTI  mode  was  much  more  reliable  and  had  a  false  alarm  rate  of  one  to  two  orders  of 
magnitude  below  STI  but  missed  stationary  targets.  A  scout  could  choose  to  use  AiTR  on  a 
selected  portion  of  an  area  so  that,  for  example,  a  highway  that  contained  much  civilian  traffic 
could  be  ignored. 
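
The interaction between the confidence threshold and the ten-chip display limit can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the integer percent confidence scale, and the sample detections are hypothetical, and the actual AiTR algorithms are not described in this report.

```python
from collections import deque

MAX_CHIPS = 10  # the text states that as many as ten chips are displayed


def update_chips(chips, detections, threshold):
    """Add detections that meet the threshold; the oldest chips drop off.

    chips: deque created with maxlen=MAX_CHIPS
    detections: iterable of (target_id, confidence_percent) in arrival order
    threshold: minimum confidence (percent) required for display
    """
    for target_id, confidence in detections:
        if confidence >= threshold:
            # A full deque with maxlen discards its oldest entry on append.
            chips.append((target_id, confidence))
    return chips


# Hypothetical detections with confidences of 5%, 10%, ..., 100%.
detections = [(i, 5 * i) for i in range(1, 21)]

low = update_chips(deque(maxlen=MAX_CHIPS), detections, threshold=30)
high = update_chips(deque(maxlen=MAX_CHIPS), detections, threshold=80)
```

With the lower threshold, more detections qualify than can be shown, so only the ten most recent remain; with the higher threshold, fewer chips are displayed, but each carries a higher confidence.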

The  joystick  unit  controlled  the  movement  and  zoom  function  of  the  sensor  in  manual  mode. 
Buttons  on  the  joystick  were  also  available  on  the  screen  and  manipulated  via  the  mouse.  These 
buttons  controlled  sensor  gain  (contrast),  level  (brightness),  and  polarity  (white  hot  versus  black 
hot), pan, focus, wide and narrow fields of view, two electronic zooms, and manual control of the
sensor.  Appendix  D  provides  illustrations  of  the  crew  station  and  the  joystick  control. 

The demonstration itself was organized, conducted, and controlled by NVESD. ARL researchers
were responsible for the collection of data, as described in this report. The study involved five
scenarios, including but not limited to watching for suspicious activity along a highway, watching for
suspicious  activity  around  an  airport  (reflects  MOUT),  observing  activity  at  an  Army  installation 
gate  (reflects  a  check  point),  observing  activity  along  a  “border”  (reflects  border  patrol  military 
operations),  and  observing  open  terrain.  The  scenarios  occurred  during  day  and  night.  Soldiers 
could choose whether to use AiTR during the scenarios. In a field test, however, it was not
possible to counterbalance the use of AiTR, scenario, and time of day for all scenarios. An
example of the pseudo-counterbalanced order is given in table 1.


Table 1. Counterbalanced scenarios and daylight conditions.


Day           Night         Day           Night         Day           Night

Highway       Airport       Check Point   Border        Open Terrain  Other
Airport       Check Point   Border        Open Terrain  Other         Highway
Check Point   Border        Open Terrain  Other         Highway       Airport
Border        Open Terrain  Other         Highway       Airport       Check Point
Open Terrain  Other         Highway       Airport       Check Point   Border
Other         Highway       Airport       Check Point   Border        Open Terrain
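
The ordering in table 1 follows the pattern of a cyclic Latin square, in which each row shifts the scenario list by one position so that every scenario appears once in each row and once in each column. The sketch below reproduces that pattern; the function name is hypothetical, and the code illustrates the ordering only, not the procedure the experimenters necessarily followed.

```python
def cyclic_latin_square(items):
    """Return an n x n cyclic Latin square built from a list of n items.

    Row r is the item list rotated left by r positions.
    """
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


scenarios = ["Highway", "Airport", "Check Point", "Border", "Open Terrain", "Other"]
orders = cyclic_latin_square(scenarios)

# Print one row per line, in the same layout as table 1.
for row in orders:
    print("  ".join(f"{name:<12}" for name in row))
```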

Data  on  workload  and  stress  were  collected  multiple  times  during  each  scenario  (day,  night,  AiTR 
activated,  AiTR  de-activated). 

3.2.3 Results

All  graphs  showing  combined  GSR  and  survey  stress  data  use  two  separate  scales.  GSR  data  are 
measured in microsiemens, and survey data are measured on a subjective rating scale from 1 to
10, with 1 being low stress and 10 being high stress. Generally, stress was low (subjective ratings)
to moderate (GSR). Subjective survey stress was highest for the airport scenario, perhaps because
of the complexity of the environment in terms of activity and distance to be covered (figure 4).
Stress (both survey and GSR) was somewhat higher at night (figure 5), which suggests that
identifying  targets  from  only  a  thermal  signature  and  the  inability  to  use  terrain  features  available 
during  daylight  may  be  more  challenging.  However,  night  scenarios  may  be  less  stressful  for 
scouts  who  have  more  experience  using  thermal  imagery  at  night.  The  GSR  and  survey  measures 
of  stress  by  use  of  AiTR  are  presented  in  figures  6  and  7,  respectively.  Stress  measures  suggest 
that  intermittent  use  of  AiTR  results  in  greater  stress  than  not  using  AiTR,  perhaps  because  of  the 
necessity  of  constantly  switching  modes  and  the  effects  of  re-establishing  SA,  based  on  the 
features  of  each  mode  (i.e.,  re-familiarizing  oneself  with  image  chips). 
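
Because GSR (microsiemens) and survey ratings (1 to 10) are on different scales, one common way to compare their trends is to standardize each series before comparing them. The sketch below is illustrative only; the function name and the per-scenario means are hypothetical and are not data from this experiment.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values to zero mean and unit sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    if s == 0:
        raise ValueError("values are constant; z-scores are undefined")
    return [(v - m) / s for v in values]


# Hypothetical per-scenario means, for illustration only.
survey_means = [2.0, 3.5, 2.5, 3.0]   # subjective rating scale, 1 to 10
gsr_means = [0.30, 0.55, 0.35, 0.40]  # microsiemens

survey_z = z_scores(survey_means)
gsr_z = z_scores(gsr_means)
```

Once standardized, the two series are unitless and can be compared directly, although this does not by itself establish that they measure the same construct.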


[Chart: Stress by Scenario]

Figure 4. AiTR experiment GSR and survey stress by scenario.

[Chart: Stress by Time of Day]

Figure 5. AiTR experiment GSR and survey stress by time of day.

[Chart: GSR by Use of AiTR]

Figure 6. AiTR experiment GSR stress by use of AiTR.

[Chart: Stress by Use of AiTR]

Figure 7. AiTR experiment survey stress by use of AiTR.


3.3  Crew-Aiding  Behavior  and  Lethality  Experiment 
3.3.1  Introduction 

As  the  U.S.  Army  network-centric  digital  battlefields  continue  to  expand,  so  do  the  workload 
demands placed on Soldiers who use the increasing amount of information to conduct their
missions. In an effort to reduce workload and stress for these Soldiers, decision aids, called
crew-aiding behaviors (CABs), have been developed which provide a level of automation designed to
assist  Soldiers  in  the  performance  of  their  tasks.  A  field-based  experiment  was  conducted  to  assess 
the  effects  of  these  decision  aids  on  Soldier  performance  in  a  simulated  battlefield  environment. 

We  evaluated  the  effects  of  the  CABs  by  measuring  and  comparing  levels  of  task  time,  workload, 
stress, and SA between two experimental conditions. The experimental task was target
prioritization, weapon system and munition matching, and target engagement with and without the use of
the  decision  aids. 

This  experiment,  known  as  the  Lethality  Experiment,  was  one  of  several  experiments  conducted 
under the name of the U.S. Army Research Development and Engineering Command
(RDECOM)-Unit of Action Maneuver Battle Lab (UAMBL) Experiment Fiscal Year 2006 (RUX06). These
experiments were conducted jointly among RDECOM, specifically, ARL’s HRED; Tank
Automotive Research Development and Engineering Center; Aviation and Missile Research
Development and Engineering Center (AMRDEC); and UAMBL in support of the Crew-integration and
Automation Test Bed Advanced Technology Demonstration program (CAT-ATD).
Experimentation was conducted at Fort Knox, Kentucky, in July 2006.

The objective of this research was to determine the impact of CABs on Soldier workload, stress,
SA,  and  performance.  Specifically,  this  experiment  examined  the  effectiveness  of  CABs  designed 
to  prioritize  targets  (based  on  threat  level  and  proximity)  and  to  provide  weapons  platform  and 
munition  recommendations  to  service  each  target. 
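
Prioritization by threat level and proximity can be illustrated with a simple two-key sort. The report does not describe the CAB's actual algorithm, so everything in the sketch below is an assumption: the `Target` fields, the integer threat scale, and the rule that threat level takes precedence over range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    name: str
    threat_level: int  # hypothetical scale: higher means more dangerous
    range_m: float     # distance from own vehicle, in meters


def prioritize(targets):
    """Order targets by descending threat level, then by ascending range."""
    return sorted(targets, key=lambda t: (-t.threat_level, t.range_m))


# Hypothetical target list, for illustration only.
targets = [
    Target("truck", 1, 800.0),
    Target("tank", 3, 2500.0),
    Target("APC", 3, 1200.0),
]
ranked = prioritize(targets)
```

Under these assumed rules, the two high-threat targets precede the truck even though the truck is closest, and the nearer of the two is listed first.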

3.3.2  Method 

This  experiment  took  place  entirely  in  simulation;  however,  the  crew  station  was  identical  to  that 
used  in  the  actual  field  vehicle.  The  SIL  interface  (figure  8)  consisted  of  three  vertically  oriented 
liquid  crystal  displays  situated  in  an  arc  in  front  of  a  seated  participant.  Each  display  was  divided 
in  two,  horizontally,  with  information  on  each  of  the  six  “screens”  being  provided  from  various 
computer  systems,  which  were  transparent  to  the  SIL  operation  and  the  participant.  Figure  9 
shows  the  basic  layout  of  the  three  displays  (six  screens)  used  during  this  experiment,  with  the 
target  prioritization  list  on  the  center  display.  Participants  could  select  targets  and  weapons  by 
touching  on-screen  buttons  or  by  scrolling  through  the  list  using  a  thumb  button  on  the  driver’s 
yoke.  The  yoke  was  also  used  to  slew  the  weapon  system  and  to  engage  each  target.  Detailed 
information  regarding  each  screen  and  button  functions  is  available  in  appendix  E. 


Figure  8.  CAT  SIL  crew  station  simulator. 


Twelve active duty male Soldiers volunteered for this experiment. One Soldier was a Captain
(O3), seven Soldiers were Sergeants First Class (E7), and four Soldiers were Staff Sergeants (E6).
Military occupational specialties were primarily M1 Armor Crewmen (19K). Nine participants
were 19K, one 19D (Cavalry Scout), one 14E (Patriot Fire Control Enhanced Operator), and one
25B (Information Systems Operator-Analyst).

Figure 9. CAT SIL display layout.


Participants  were  given  a  1-hour  block  of  instruction  and  practice  for  the  task  of  prioritizing  and 
servicing  a  list  of  targets,  in  both  CAB  and  NoCAB  conditions.  The  instruction  consisted  of 
familiarization  with  the  displays  and  controls  and  a  detailed  explanation  of  the  tasks,  conditions, 
and  standards  for  the  experiment.  Depending  on  which  condition  was  presented  first,  training  for 
that  condition  was  presented  before  experimentation.  For  example,  if  a  participant  was  testing  in 
the  CAB  condition  first,  the  CAB  training  was  conducted  before  testing  in  the  CAB  condition. 
Following  testing  and  a  short  break,  the  NoCAB  training  was  conducted  before  the  NoCAB  test. 

Subjective  stress  measurements  were  collected  with  one-item  rating  scales  that  measure  physical 
stress  and  mental  stress.  These  measures  are  based  on  each  participant’s  subjective  assessment  of 
his  own  perceived  levels  of  stress  within  a  given  experimental  condition  or  session.  Objective 
stress  measurements  were  collected  via  the  SenseWear  Pro2  armband. 


3.3.3  Results 

The effect of condition on physical stress narrowly missed statistical significance. Although no significant
differences were observed, a general trend of increasing stress (both mental and physical) may be seen between CAB and
NoCAB  (figure  10).  That  is,  mental  and  physical  stress  are  higher  in  the  NoCAB  condition  than  in 
the  CAB  or  Baseline  conditions.  Although  not  statistically  significant,  it  is  believed  this  difference 
would  become  so  with  a  larger  sample  size.  As  can  be  seen  in  figure  11,  GSR  results  generally 
parallel  the  results  of  the  subjective  ratings,  with  stress  in  the  Baseline  and  CAB  conditions  being 
equivalent  and  stress  in  the  NoCAB  condition  being  higher. 

A correlation analysis conducted for this study was used to determine if a correlation existed
between  the  two  different  types  of  stress  data  and  not  to  support  a  specific  hypothesis  regarding  the 
two  methods  of  data  collection  used  to  obtain  the  data. 

No  significant  correlation  existed  between  the  subjective  stress  data  collected  by  survey  and  the 
objective  GSR  stress  data  collected  by  the  armband.  However,  data  from  both  methods  show  a 
distinct  trend  of  increasing  stress  in  the  NoCAB  condition  (versus  Baseline  and  CAB  conditions). 

In  summary,  the  analyses  of  variance  (ANOVAs)  show  that  the  higher  levels  of  stress  observed  in 
the  NoCAB  condition,  in  the  survey  and  GSR  data,  were  not  significant.  Further,  there  was  no 
correlation  between  stress  measurement  methods. 
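
The ANOVA summarized above can be sketched as follows, assuming a one-way repeated-measures
design in which every participant is measured once in each condition (Baseline, CAB, NoCAB).
The report does not specify the ANOVA model, so this design, the function name `rm_anova_f`,
and the example values are assumptions; the sketch returns only the F statistic and its degrees
of freedom, which would then be compared against an F table or a statistics package.

```python
def rm_anova_f(scores):
    """One-way repeated-measures ANOVA.

    scores: list of rows, one row per participant, one value per condition.
    Returns (F, df_conditions, df_error).
    """
    n = len(scores)           # participants
    k = len(scores[0])        # conditions
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need >= 2 participants and >= 2 conditions per row")

    grand_mean = sum(sum(row) for row in scores) / (n * k)
    cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subj_means = [sum(row) / k for row in scores]

    # Partition the total sum of squares into condition, participant, and error.
    ss_total = sum((v - grand_mean) ** 2 for row in scores for v in row)
    ss_cond = n * sum((m - grand_mean) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand_mean) ** 2 for m in subj_means)
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_error = (k - 1) * (n - 1)
    if ss_error <= 0:
        raise ValueError("error sum of squares is zero; F is undefined")
    f_stat = (ss_cond / df_cond) / (ss_error / df_error)
    return f_stat, df_cond, df_error


if __name__ == "__main__":
    # Hypothetical placeholder ratings, NOT data from the report.
    # Columns: Baseline, CAB, NoCAB.
    ratings = [[1, 2, 3], [2, 3, 5], [3, 5, 6]]
    f_stat, df1, df2 = rm_anova_f(ratings)
    print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

Removing between-participant variability from the error term is what distinguishes this model
from a between-groups ANOVA, and it matters when participants differ widely in baseline stress.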


Lethality  Experiment  Average  Stress  Levels 
(subjective  rating) 

Figure 10. Comparison of physical and mental stress across conditions.


First, the non-significant ANOVA results may be attributable to the small sample size. The
graphs in figures 10 and 11 show a trend of increasing stress in the NoCAB condition compared
with the Baseline and CAB conditions, and they show that stress levels in the CAB condition are
closely aligned with those in the Baseline condition. It is believed that, with a larger sample,
this trend toward greater stress in the NoCAB condition would be statistically significant.


Second, the absence of a significant correlation between the stress data captured by the two
methods may, again, simply reflect a sample too small to yield statistically significant
results. As previously stated, both sets of data show the trend of increased stress in the
NoCAB condition when compared with the Baseline and CAB conditions.
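
The sample-size argument above can be made concrete with a rough planning calculation. The
sketch below uses the standard normal approximation for a two-sided paired comparison (for
example, NoCAB versus CAB), n ≈ ((z_(1-α/2) + z_power) / d)², where d is the standardized mean
difference of the paired scores. The effect sizes shown are hypothetical, not estimates from
this study, and the function name is an assumption; the approximation also tends to
underestimate n when n is small, so an exact t-based calculation would be preferable for final
planning.

```python
from math import ceil
from statistics import NormalDist


def paired_sample_size(effect_size_dz, alpha=0.05, power=0.80):
    """Approximate participants needed for a two-sided paired comparison.

    effect_size_dz: mean of the paired differences divided by their
    standard deviation (must be positive).
    """
    if effect_size_dz <= 0:
        raise ValueError("effect size must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(((z_alpha + z_power) / effect_size_dz) ** 2)


if __name__ == "__main__":
    # Hypothetical effect sizes, chosen only to show how n scales.
    for dz in (0.3, 0.5, 0.8):
        print(f"d = {dz}: about {paired_sample_size(dz)} participants")
```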

Lethality Experiment Average Stress Levels
(objective GSR)

Figure 11. Comparison of GSR stress data across conditions.


4.  Overall  Assessment/Comparison  Between  Methods 


This research suggests that GSR is a promising measure of stress. Although there was no
subjective measure of stress with which to compare GSR in the HTSS experiment, the GSR results
were consistent with what would be expected. The confined space and unpredictable targets of
the MOUT environment were more stressful to the participants (with both the HTSS and NVGs) than
the less eventful and more familiar scenario experienced in the movement environment. Further,
stress was higher when participants used an unfamiliar technology (HTSS) to identify targets
than when they identified targets with the traditional “head-out-of-the-hatch” view.

In the AiTR experiment, there was reasonable correspondence between subjective stress measures
and GSR. Survey and GSR results were somewhat inconsistent across scenarios: the survey measure
was highest for the airport scenario, but the airport, highway, and open scenarios were equally
stressful as measured by GSR. Both measures were higher for night scenarios than for day
scenarios, and both were higher for intermittent use of AiTR than for constant use of AiTR.

In  the  targeting  experiment,  the  survey  and  GSR  measures  of  stress  were  consistent  in  that  both 
measures  showed  that  the  Baseline  and  CAB  conditions  were  about  equally  stressful,  while  the 
NoCAB  condition,  as  might  be  expected,  was  the  most  stressful. 


4.1 Caveats in the Current Report/Study

Of  course,  all  the  research  efforts  just  described  have  drawbacks  in  that  relatively  few  participants 
were  used.  The  AiTR  study  had  further  problems  since  there  was  no  control  or  ability  to 
counterbalance  conditions;  thus,  no  inferential  statistics  were  possible.  More  controlled  research 
with  more  participants  is  necessary  for  a  more  robust  comparison  among  these  methods. 


5.  Recommendations  for  Further  Research 


To  measure  Soldier  stress  levels  during  training  exercises  or  during  scientific  experimentation,  it 
is  desirable  to  have  a  fast,  reliable,  objective,  and  non-intrusive  method  for  collecting  stress  data. 
Surveys are subjective and rely entirely on self-reporting, which is prone to bias. Salivary
amylase is objective and reliable, but it is neither fast nor non-invasive. Using a
bio-instrumentation device to collect GSR data appears to offer a better solution, since it is
non-invasive, reliable, and expeditious and collects data continuously throughout a training
mission or experimental session. These properties make it well suited to researchers in the field.

To determine which method is the best solution in terms of time to collect data, accuracy,
invasiveness, and preference (of both experimenter and participant), it is suggested that a
study be conducted with all three methods of data collection: subjective surveys, objective
GSR, and objective salivary amylase. It is hypothesized that the objective methods will exhibit
high positive correlations with each other, thereby demonstrating that GSR is an acceptable
method (at least when compared with salivary amylase) for collecting stress data in the field.
Further, it is expected that data from the subjective method will not correlate with data from
either objective method, although they are expected to exhibit similar trends.


Appendix A. Stress Survey


Participant  ID: _  Date: _  Time: _  Experiment: _ Condition: _ 

Subjective  Stress  Rating  Scale 

a.  The  scale  below  represents  a  range  of  how  PHYSICALLY  stressful  the  mission  might  be. 
Check  the  block  indicating  how  PHYSICALLY  stressful  the  mission  you  just 
participated  in  was. 


Task                Not at All                                    Most Possible
                    Stressful                                     Stress
                    1     2     3     4     5     6     7     8     9     10

a. Overall stress   _     _     _     _     _     _     _     _     _     _

a.  The  scale  below  represents  a  range  of  how  MENTALLY  stressful  the  mission  might  be. 
Check  the  block  indicating  how  MENTALLY  stressful  the  mission  that  you  just 
participated  in  was. 


Task                Not at All                                    Most Possible
                    Stressful                                     Stress
                    1     2     3     4     5     6     7     8     9     10

a. Overall stress   _     _     _     _     _     _     _     _     _     _


Appendix B. Armband Specifics


Manufacturer’s  Data  Sheet  for  the  SenseWear  Pro2  Armband 


Product  Features 

The SenseWear™ Pro2 Armband is a sleek, wireless, wearable body monitor that enables continuous
physiological and lifestyle data collection outside the lab environment. Worn on the back of
the upper right arm, it utilizes a unique combination of sensors and technologies that

• Gather raw physiological data such as movement, heat flow, skin temperature, ambient
temperature, and galvanic skin response.

• Can be worn up to 14 days continuously without changing the battery.

• Stores up to 14 days of continuous physiological and lifestyle data.

• Allows research subjects to timestamp specific events.

• Compatible with our innerView™ Research Software.**

• Offers audio and tactile feedback for reminders and alerts.**

• Enables 2-way communication, making the Armband a hub for collecting data from other
third-party products such as a weight scale or blood pressure cuff.***

•  Eliminates  the  need  for  researchers  and  clinicians  to  administer  and  apply 
cumbersome  sensors  to  their  research  subjects 


Appendix C. HTSS Components


HTSS  gimbaled  sensor  platform  mounted  to  test  vehicle 


HTSS components (inside an M1A2 tank): processor, controls, helmet unit


Appendix D. AiTR User Interface


AiTR  crew  station 


Appendix E. CAT SIL Screen and Button Functions


(from  AMRDEC  training  slides) 


Target  Acquisition  Display  -  Yoke  Controls 


Magnification 

• Increase / decrease the TA sensor magnification from 0.5X to 24X.


Slew 

•  Slew  TA  sensor  left  /  right 

•  Tilt  TA  sensor  up  /  down 


Fight  Suite  -  Target  Queue 


Labeled screen areas: Reports, Target Acquisition, Planning, Assets, Target Queue